Declarative Web data extraction and annotation
Abstract
We propose a software architecture for semantics-based annotation of data extracted from Web sources. Starting from the LiXto suite, which enables semi-automated extraction of XML data from regular documents, we present a solution for attaching background information to individual tags by means of so-called decorations. Decoration is carried out as an inferential activity in the formal context of Answer Set Programming. We discuss a motivating example that serves as a validation of our approach.
Similar resources
OWL-AA: Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation
Although OWL provides a solid basis for many semantic web applications, it lacks sufficient declarative semantics for instance recognition to support automated semantic annotation. This omission prevents OWL from being a satisfactory ontology language for automated semantic annotation. This problem can be solved by adding declarative instance recognition semantics to OWL. Our declarative instan...
Web-Style Multimedia Annotations
Annotation of multimedia resources supports a wide range of applications, ranging from associating metadata with multimedia resources or parts of these resources, to the collaborative use of multimedia resources through distributed authoring and annotation. Most annotation frameworks, however, are based on a closed approach, where the annotation data is limited to the a...
Data extraction and annotation based on domain-specific ontology evolution for deep web
Deep web sources respond to a user query with result records encoded in HTML files. Data extraction and data annotation, which are important for many applications, extract and annotate the records from the HTML pages. We propose a domain-specific, ontology-based data extraction and annotation technique; we first construct a mini-ontology for a specific domain according to information of query interface and qu...
Information Extraction from Unstructured and Ungrammatical Data Sources for Semantic Annotation
The internet has become an attractive avenue for global e-business, e-learning, knowledge sharing, etc. Due to the continuous increase in the volume of web content, it is not practical for a user to extract information by browsing and integrating data from the huge number of web sources retrieved by existing search engines. The semantic web technology enables advancement in information...
Annotation for Query Result Records based on Domain-Specific Ontology
The World Wide Web is enriched with a large collection of data, scattered across deep web databases and web pages in unstructured or semi-structured formats. Recently evolving customer-friendly web applications need special data extraction mechanisms to draw out the required data from the deep web according to the end user's query and populate the output page dynamically as fast as possible. In...